Combining nearest neighbor classifiers versus cross-validation selection.
Abstract
Various discriminant methods have been applied for classification of tumors based on gene expression profiles, among which the nearest neighbor (NN) method has been reported to perform relatively well. Usually, cross-validation (CV) is used to select the neighbor size as well as the number of variables for the NN method. However, CV can perform poorly when there is considerable uncertainty in choosing the best candidate classifier. As an alternative to selecting a single "winner," we propose a weighting method to combine the multiple NN rules. Four gene expression data sets are used to compare its performance with CV methods. The results show that when the CV selection is unstable, the combined classifier performs much better.
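The contrast between selecting one NN rule by CV and combining several can be illustrated with a minimal sketch. The abstract does not state the paper's weighting formula, so the exponential weights below, the `temperature` parameter, and all function names are assumptions made for illustration only, not the authors' method. The sketch scores each candidate neighbor size by leave-one-out accuracy, then either keeps the single best rule or averages the vote fractions of all rules.

```python
import numpy as np

def knn_vote_fractions(X_train, y_train, X_test, k, n_classes):
    """Fraction of the k nearest training points that belong to each class."""
    # Squared Euclidean distance between every test and training point.
    d = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d, axis=1, kind="stable")[:, :k]
    labels = y_train[nearest]  # shape (n_test, k)
    votes = np.zeros((X_test.shape[0], n_classes))
    for c in range(n_classes):
        votes[:, c] = (labels == c).mean(axis=1)
    return votes

def loo_accuracy(X, y, k, n_classes):
    """Leave-one-out CV accuracy of the k-NN rule."""
    n = len(y)
    correct = 0
    for i in range(n):
        mask = np.arange(n) != i  # hold out sample i
        votes = knn_vote_fractions(X[mask], y[mask], X[i:i + 1], k, n_classes)
        correct += int(votes.argmax(axis=1)[0] == y[i])
    return correct / n

def cv_select_and_combine(X_train, y_train, X_test, ks, n_classes,
                          temperature=0.05):
    """Return (CV-selected predictions, combined predictions, weights).

    The exponential weighting is a hypothetical choice for this sketch:
    rules with CV accuracy close to the best one receive similar weight.
    """
    acc = np.array([loo_accuracy(X_train, y_train, k, n_classes) for k in ks])

    # CV selection: keep only the single "winner".
    best_k = ks[int(acc.argmax())]
    selected = knn_vote_fractions(
        X_train, y_train, X_test, best_k, n_classes).argmax(axis=1)

    # Combination: weight every candidate rule by its CV accuracy.
    w = np.exp((acc - acc.max()) / temperature)
    w /= w.sum()
    combined_votes = sum(
        wk * knn_vote_fractions(X_train, y_train, X_test, k, n_classes)
        for wk, k in zip(w, ks))
    return selected, combined_votes.argmax(axis=1), w
```

When several neighbor sizes have nearly equal CV accuracy, the weights in this sketch are spread across them, so the combined prediction does not hinge on which candidate happens to win by a narrow margin.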
Related articles
Complete Cross-Validation for Nearest Neighbor Classifiers
Cross-validation is an established technique for estimating the accuracy of a classifier and is normally performed either using a number of random test/train partitions of the data, or using k-fold cross-validation. We present a technique for calculating the complete cross-validation for nearest-neighbor classifiers: i.e., averaging over all desired test/train partitions of data. This technique ...
ACO Based Feature Subset Selection for Multiple k-Nearest Neighbor Classifiers
The k-nearest neighbor (k-NN) is one of the most popular algorithms used for classification in various fields of pattern recognition & data mining problems. In k-nearest neighbor classification, the result of a new instance query is classified based on the majority of k-nearest neighbors. Recently researchers have begun paying attention to combining a set of individual k-NN classifiers, each us...
Combining Nearest Neighbor Classifiers Through Multiple Feature Subsets
Combining multiple classifiers is an effective technique for improving accuracy. There are many general combining algorithms, such as Bagging or Error Correcting Output Coding, that significantly improve classifiers like decision trees, rule learners, or neural networks. Unfortunately, many combining methods do not improve the nearest neighbor classifier. In this paper, we present MFS, a combining a...
Human Activity Recognition Using Inertial/Magnetic Sensor Units
This paper provides a comparative study on the different techniques of classifying human activities that are performed using body-worn miniature inertial and magnetic sensors. The classification techniques implemented and compared in this study are: Bayesian decision making (BDM), the least-squares method (LSM), the k-nearest neighbor algorithm (k-NN), dynamic time warping (DTW), support vector ...
A Classification Method for E-mail Spam Using a Hybrid Approach for Feature Selection Optimization
Spam is an unwanted email that is harmful to communications around the world. Spam leads to a growing problem in a personal email, so it would be essential to detect it. Machine learning is very useful to solve this problem as it shows good results in order to learn all the requisite patterns for classification due to its adaptive existence. Nonetheless, in spam detection, there are a large num...
Journal:
- Statistical applications in genetics and molecular biology
Volume 3
Publication date: 2004